Implementing Efficient MPI on LAPI for IBM RS/6000 SP Systems: Experiences and Performance Evaluation

Authors

  • Mohammad Banikazemi
  • Rama Govindaraju
  • Robert Blackmore
  • Dhabaleswar K. Panda
Abstract

The IBM RS/6000 SP system is one of the most cost-effective commercially available high-performance machines. IBM RS/6000 SP systems support the Message Passing Interface (MPI) standard and LAPI. LAPI is a low-level, reliable, and efficient one-sided communication API library implemented on IBM RS/6000 SP systems. This paper explains how the high performance of the LAPI library has been exploited to implement the MPI standard more efficiently than the existing MPI implementation. It describes how to avoid unnecessary data copies at both the sending and receiving sides for such an implementation. The resolution of problems arising from mismatches between the requirements of the MPI standard and the features of LAPI is discussed. As a result of this exercise, certain enhancements to LAPI are identified to enable an efficient implementation of MPI on LAPI. The performance of the new implementation of MPI is compared with that of the underlying LAPI itself. The latency (in polling and interrupt modes) and bandwidth of our new implementation are compared with those of the native MPI implementation on RS/6000 SP systems. The results indicate that the MPI implementation on LAPI performs comparably to or better than the original MPI implementation in most cases. Improvements in polling mode latency, in interrupt mode latency, and in bandwidth are obtained for certain message sizes. The implementation of MPI on top of LAPI also outperforms the native MPI implementation for the NAS Parallel Benchmarks. It should be noted that the implementation of MPI on top of LAPI is not a part of any IBM product and no assumptions should be made regarding its availability as a product.

Similar resources

Performance and Experience with LAPI - a New High-Performance Communication Library for the IBM RS/6000 SP

LAPI is a low-level, high-performance communication interface available on the IBM RS/6000 SP system. It provides an active-message-like interface along with remote memory copy and synchronization functionality. It is designed primarily for use by experienced programmers in developing parallel subsystems, libraries and tools, but we also expect power programmers to use it in end-user application...

Performance Evaluation and Modeling of Reduction Operations on the IBM RS/6000 SP Parallel Computer

We discuss algorithms for global reduction (or combine) operations (e.g., global sums) for numbers of processors that need not be a power of 2, and implement these using standard message-passing techniques on distributed-memory parallel computers. We present performance results measured on an IBM RS/6000 SP parallel computer at UNIC. Significant performance improvements are obtained by using a r...

Benchmark Evaluation of the Message-Passing Overhead on Modern Parallel Architectures

This paper was inspired by an interesting investigation of the performance of MPI on an IBM RS/6000 SP machine [1]. The authors proposed a model for the evaluation of message-passing overhead and suggested evaluating message-passing performance on as many hardware platforms as possible. In some further investigations such evaluation was extended to other parallel platf...

A comparison of MPI performance on

Since MPI [1] has become a standard for message passing on distributed-memory machines, a number of implementations have evolved. Today there is an MPI implementation available for all relevant MPP systems, a number of which are based on MPICH [2]. In this paper we present a performance comparison of several implementations of MPI on different MPPs. Results for the Cray T3E, the IBM RS/6...

Newton Two-stage Parallel Iterative Methods for Nonlinear Problems

Two-stage parallel Newton iterative methods to solve nonlinear systems of the form F(x) = 0 are introduced. These algorithms are based on the multisplitting technique and on the two-stage iterative methods. Convergence properties of these methods are studied when the Jacobian matrix F′(x) is either monotone or an H-matrix. Furthermore, in order to illustrate the performance of the algorithms ...

Publication date: 1999